Near Optimal On-Policy Control

Authors

  • Matthew Robards
  • Peter Sunehag
Abstract

We introduce two online gradient-based reinforcement learning algorithms with function approximation, one model-based and the other model-free, and provide a regret analysis for both. Unlike many other analyses of gradient-based algorithms for reinforcement learning with function approximation, our regret analysis makes no probabilistic assumptions, so we need not assume a fixed behavior policy.

Similar resources

COMPARISON BETWEEN MINIMUM AND NEAR MINIMUM TIME OPTIMAL CONTROL OF A FLEXIBLE SLEWING SPACECRAFT

In this paper, minimum-time and near-minimum-time optimal control laws are developed and compared for a rigid space platform with flexible links during an orientating maneuver with a large angle of rotation. The control commands are taken to be typical bang-bang commands with multiple symmetrical switches; the time-optimal control solution for the rigid-body mode is obtained as a bang-bang function and app...

Optimal Energy Management and Feasibility Study of a Hybrid Energy System for a Remote Area

This paper investigates the impacts of possible changes in energy policy and consumption behavior on the optimal energy management and feasibility of a hybrid energy system. The study was performed on a remote area near Esfarjan, a village located in Shahreza, Iran. In the main scenario, the current energy policy is applied while the consumption behavior of customers is studied by means of an inc...

Near-Minimum Time Optimal Control of Flexible Spacecraft during Slewing Maneuver

The rapid growth of space utilization requires extensive construction and maintenance of space structures and satellites in orbit. This will, in turn, substantiate application of robotic systems in space. In this paper, a near-minimum-time optimal control law is developed for a rigid space platform with flexible links during an orientating maneuver with a large angle of rotation. The time op...

Near-Optimal Controls of a Fuel Cell Coupled with Reformer using Singular Perturbation methods

A singularly perturbed model is proposed for a system comprising a PEM Fuel Cell (PEM-FC) with a Natural Gas Hydrogen Reformer (NG-HR). This eighteenth-order system is decomposed into slow and fast lower-order subsystems using singular perturbation techniques, which provide tools for separation and order reduction. Then, three different types of controllers, namely an optimal full-order, a near-opti...

Publication date: 2011